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The most critical component of any adaptive numerical quadrature routine is the estimation 
of the integration error. Since the publication of the first algorithms in the 1960s, many error 
estimation schemes have been presented, evaluated and discussed. This paper presents a review 
of existing error estimation techniques and discusses their differences and their common features. 
Some common shortcomings of these algorithms are discussed and a new general error estimation 
technique is presented. 

Categories and Subject Descriptors: F.2.1 [Numerical Analysis]: Numerical Algorithms and 
Problems — Computations on polynomials; G.1.0 [Numerical Analysis]: General — Error analy- 
sis; Numerical algorithms; G.1.0 [Numerical Analysis]: Interpolation — Interpolation formulas; 
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quadrature; Error analysis 

General Terms: Algorithms, Reliability 

Additional Key Words and Phrases: Numerical integration, Adaptive quadrature, Error estimation 



1. INTRODUCTION 

Adaptive quadrature, or adaptive numerical integration, refers to the process of 
approximating the integral of a given function to a specified precision by adap- 
tively subdividing the integration interval into smaller sub-intervals over which a 
set of local quadrature rules are applied. Since the publication of the first adap- 
tive quadrature routines almost 50 years ago [Morrin 1955; Villars 1956; Kuncir 
1962], more than 20 distinct algorithms have been published, along several papers 
dedicated to their analysis [Casalctto ct al. 1969; Hillstrom 1970; Kahaner 1971; 
Malcolm and Simpson 1975; Robinson 1979; Krommcr and Uberhuber 1998] and 
even on methodologies for their analysis [Lyness and Kaganovc 1977]. 



Algorithm 1 integrate (/, a, b, r) 



1: Q n [a,b) « j a f(x)dx 
e w Q„[o,6] - J a f(x)dx 
if e < t then 

return Q n [a, b] 
else 

m <- (a + b)/2 

return integrate(/, a, m, r') + integrate(/, m, b, r') 
end if 



Many recursive adaptive quadrature routines follow the general scheme detailed 
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in Algorithm 1. In Line 1 an approximation Q n [a, b] to the integral of f(x) over n 
points in the interval [a, b] is computed and in Line 2 the error of this approximation 
is estimated. If this error is less than some user-specified local tolerance r the 
algorithm returns the approximation Q n [a,b]. If the error is deemed too large, 
the interval is subdivided (in this example bisection is used) and the integration 
algorithm is applied recursively on both intervals separately for some new, adjusted 
tolerance r'. 

In the following, we will use Q n [a, b] to denote a generic interpolatory quadrature 
rule over n points in the interval [a, 6]. For specific or well-known quadrature 
rules, we will use specific symbols such as NC„ [a, b] for Newton-Cotes, CC n [a, b] for 
Clenshaw-Curtis and G„[a, b] and GK„[a, b] for Gauss and Gauss-Kronrod rules over 
7i points respectively. We will use the notation Q„ ' [a, b] to denote the quadrature 
rule Q„ applied on m panels of equal size in [a, b}. In [Davis and Rabinowitz 1967] 
Qn™^ [a, b] is referred to as a compound or composite quadrature rule. We will call 
m the multiplicity of Qn [a, b] . 

A slightly different approach to Algorithm 1, motivated by the desire for a sharper 
global error estimate and better interval selection criteria — and partially due to the 
unavailability of recursion in early computer programming languages — is shown 
in Algorithm 2. In this non-recursive approach, a heap of intervals, sorted by their 
local error estimates, is maintained (Line 3). As long as the sum of the individual 
error estimates is larger than the required global tolerance r (Line 4), the interval 
at the top of the heap (i.e. the interval with the largest error estimate, Line 5) is 
subdivided (Line 6). The resulting subintervals are evaluated (Lines 7 to 10) and 
returned to the heap (Lines 13 and 14), and the global integral and global error 
estimate are updated (Lines 11 and 12). 



Algorithm 2 integrate (/, a, b, r) 



1: H-Q n [a,b]nf°f{x)dx 

e <- s « Q„ [a, b] - j b a f(x) dx 

initialize heap H with interval 
while e > r do 

k <— index of interval with largest e% in H 
m <- (a k +b k )/2 

/left « f™ f(x) dx 
/right ~ /J f(x) dx 



b], integral Q n [a, b] and error Eq 



£\eft K 

£ right 



Q n [a k ,m] - f™f(x) dx 

b. 



Qn[m,b k } - J J f{x) dx 

I <— I — Ik + ^left + /right 
E <— e — Ek + £| e f t + £ r j g ht 

push interval [a k , m] with integral /| e f t and error £| e f t onto H 
push interval [m, b k ] with integral / r i g ht and error e r i g ht onto H 

end while 

return / 
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If the integrand is Riemann intcgrablc and the error estimates are exact, both 
Algorithm 1 and Algorithm 2 will converge to the exact integral. It is therefore 
only failures in the estimation of the integration error that will cause the quadrature 
algorithms to fail. It is for this reason that inn this review, we will concentrate only 
on the error estimate 

e « Q n [o,6] - / f{x)dx . 

J a 

as it is computed in Line 2 of Algorithm 1 and Lines 2, 9 and 10 of Algorithm 2. 

We will distinguish between the local and global error of an adaptive quadrature 
routine. During adaptive integration, the interval is subdivided into sub-intervals 
[afc,6fc] with a < ak < bf, < b. This subdivision occurs either recursively (as in 
Line 7 of Algorithm 1) or explicitly (as in Lines 13-14 of Algorithm 2). The local 

error Ek of the kfi 1 interval [a,k, bk] and the global error e are defined as 



£k 



Q n [ak,bk 



f(x) dx 



and 



^ Qn[ak, bk] 



f(x)dx 



(1) 



The sum of the local errors forms an upper bound for the global error (e < J^ fc Sk)- 
We further distinguish between the absolute errors (1), the locally relative error 
and the globally relative local error 



.(ii-ei; 



Qn[afc, b k ] - J" * f(x) dx 



.(grel) 



Qn[ak,bk] 



f bk /(a 



dx 



dx 



(2) 



We also define the global relative error which is bounded by the sum of the globally 
relative local errors: 



£ = 



Efe Qn[a k , b k ] - J a f(x) dx\ 



< 



k 



i[a k ,b k ] - J a h f{x)dx 



II 



II 

The sum of the locally relative errors, however, form no such bound. 

In the following, we will often refer to the degree of a quadrature rule. A quadra- 
ture rule is of degree n when it integrates all polynomials of degree < n exactly, 
but not all polynomials of degree n + 1. This is synonymous with the precise degree 
of exactness as defined by Gautschi [2004] or the degree of accuracy as defined by 
Krommer and Uberhuber [1998]. If a quadrature rule is of degree n, then its order 
of accuracy as defined by Skeel and Keiper [1993], to which we will simply refer to 
as its order, is n + 1. 

The goal of this review is to analyze and compare different error estimation tech- 
niques qualitatively, similarly to the analysis by Laurie [1985]. We will start with 
an overview of the most significant contributions over the last 50 years. Following 
this analysis, we will present a new error estimator which overcomes most of the 
problems observed in previous error estimators. 

In the following two sections we will discuss existing linear (Section 2) and non- 
linear (Section 3) error estimation techniques 1 . In Section 4 a new error estimation 



1 For a more detailed review, see [Gonnet 2009a] . 
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technique is presented and its relation to previous error estimators is discussed. In 
Section 5 we will apply the discussed error estimators to a number of test functions 
to assess their performance. In Section 6 we discuss these results and try to interpret 
them qualitatively. 

2. LINEAR ERROR ESTIMATORS 

In this section we will look at a number of linear error estimators. We define a linear 
error estimator as an estimate computed from a linear combination of evaluations 
of the integrand. Such estimators can be quadrature-like rules, linear combinations 
or differences of quadrature rules or quantities computed using linear extrapolation 
techniques, e.g. the Romberg scheme. 

2.1 Early Error Estimators Based on Rules of Equal Degree 

There seems to be some confusion as to who actually published the first adaptive 
quadrature algorithm. Davis and Rabinowitz [1967] cite the works of Villars [1956], 
Henriksson [1961] and Kuncir (see Section 2.1). 

Although no explicit attribution is given, Hcnriksson's algorithm seems to be 
an unmodified ALGOL-implcmentation of the algorithm described by Villars which 
is, as the author himself states, only a slight modification of a routine developed 
by Morrin [1955, cited in Villars 1956] in 1955. These three algorithms are more 
reminiscent of ODE-solvers, integrating the function stepwise from left to right 
using Simpson's rule and adapting (doubling or halving) the step-size whenever an 
estimate converges or fails to do so. In doing so they effectively discard function 
evaluations and so lose information on the structure of the integrand. We will 
therefore not consider them to be "genuine" adaptive integrators. 

In 1962, Kuncir [1962] publishes the first adaptive quadrature routine 2 following 
the scheme in Algorithm 1 and using the locally relative local error estimate 

_ SW[a fc A.]-S( 2 >[a fc A] 
S( 2 )[a fc A] 

where S^fa^A] is Simpson's rule applied over the entire interval [a^A] and 
S( 2 )[afcA] is Simpson's rule applied on the sub- intervals [a, and [^2,6]. If 
the error estimate is below the required tolerance, the estimate S( 2 '[<ZfcA] is used 
as the local approximation to the integral. 

The error estimate is based on the assumption that if the estimate S( 2 )[<ZfcA] 
is a better approximation of the integral than SW^A], the difference between 
both estimates will be a good estimate of the difference between S^fa^A] and 
the actual integral. 

Replacing every evaluation of the integrand in the un-scaled error estimate (3) 
with an appropriate f(a + h) and expanding it in a Taylor expansion around a, as 

2 Although Kuncir predates McKeeman by about half a year, many publications [Espclid 2007; 
2002; 2004a; 2003; Berntsen and Espelid 1991; Malcolm and Simpson 1975], credit McKeeman 
with having published the first adaptive integrator. Interestingly enough, the very similar works 
of both Kuncir and McKeeman were both published in the same journal (Communications of the 
ACM) in the same year (1962) in different issues of the same volume (Volume 5), both edited by 
the same editor (J.H. Wegstein). This duplication of efforts does not seem to have been noticed 
at the time. 
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is done in [Gander 2006], we obtain 



Inserting the Taylor expansion into the actual error gives a similar result: 



a fc 46 080 



(4) 



(5) 



If we assume that f^\x) is more or less constant for a: G [afc, 6fc] and both (4) and 
(5) therefore have similar values for f^ 4 \^), then the error estimate is actually 15 
times larger than the actual integration error. This factor of 15 might seem large, 
but in practice it is a good guard against bad estimates when f^(x) is not constant 
for x G [a k ,b k ]. 

In the same year, McKeeman [1962] publishes a similar recursive algorithm (fol- 
lowing Algorithm 1, yet using trisection instead of bisection) using the globally 
relative local error estimate 



S^[a k ,b k ]~S^[a k ,b k _ 



(6) 



where I is an approximation to the global integral of the absolute value of f(x). 

Using the same analysis as in (4) , we can compute the ratio of the computed and 
exact errors and obtain 



SW[a,b]-fif(x)dx 



80, 



(7) 



i.e. the error is overestimated by a factor of 80 for sufficiently smooth 3 integrand. 

The use of a globally relative local error estimate is an important improvement. 
Besides forming a correct upper bound for the global error, it does not run into 
problems in sub- intervals where the integrand approaches 0, causing any locally 
relative error estimate to approach infinity. The use of an error relative to the global 
integral of the absolute value of the function is a good guard against cancellation 
or smearing [Henrici 1982] when summing-up the integrals over the sub-intervals. 

A year later, McKeeman and Tesler [1963] publish a non-recursive 4 version of 
of the integrator with a better local tolerance computation and shortly thereafter, 
McKeeman publishes another recursive adaptive integrator [McKeeman 1963] based 
on Newton-Cotes rules over a set of n points, where n is a user-defined parameter. 
In the same vein as the previous integrator, the following error estimate is used 



1 



NCWKAl-NC^VfcA] 



(8) 



3 In the following, we will use the rather loose expression "sufficiently smooth" when, for a quadra- 
ture rule of order n, the nth derivative of the integrand is sufficiently close to constant in the 
integration interval, such that the error estimate will not fail. 

4 Their algorithm is non-recursive in the sense that an explicit stack is maintained, analogous to 
the one generated in memory during recursion, and not as in the scheme presented in Algorithm 2 
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At every recursion level, the interval is subdivided into n — 1 panels and, if the 
tolerance is met, the value of l\ic£ n_1 ^ [a, b] is used as an approximation to the 
integral. 

Replacing the evaluations of the integrand f(a + h) by their Taylor expansions 
around a and inserting them into the ratio of the computed and exact error as in 
(7), we can see that for n = 3 (i.e. applying Simpson's rule), we overestimate the 
actual error by a factor of 15. For n = 4, this factor grows to 80, as observed for 
McKeeman's first integrator (see (7)). For n = 5 it is 4 095 and for n = 8, the 
maximum allowed in the algorithm, it is 5 764 800 (7 decimal digits!), making this 
a somewhat strict estimate both in theory and in practice. 

In 1969, Lyness [1969] publishes the first rigorous analysis of McKeeman's in- 
tegrator and implements a revised algorithm, SQUANK [Lyness 1970]. He suggests 
using the absolute local error instead of the globally relative local error, bisection 
instead of trisection and includes the resulting factor of 15 in the error estimate 5 : 



1 

k 15 



S^[a k ,b k ]-S^[a k ,b k ] (9) 



He further suggests using Romberg extrapolation to compute the five-node Newton- 
Cotes formula from the two Simpson's approximations 6 : 

NC^ [a, b] = — (l6S (2) [a, b] - S (1) [a, b}) . (10) 
15 V / 

This is a departure from previous methods, in which the error estimate and the 
integral approximation were of the same degree, making it impracticable to re- 
late the error estimate to the integral approximation without making additional 
assumptions on the smoothness of the integrand. 

In a 1975 paper, Malcolm and Simpson [1975] present a global version of SQUANK 
called SQUAGE (Simpson's Quadrature Used Adaptively Global Error) along the 
lines of Algorithm 2, and conclude that global adaptivity allows for better control 
of the error estimate . 

In 1977, Forsythe et al. [1977] publish the recursive quadrature routine QUANC8, 
which uses essentially the same basic error estimate as Lyness (9), yet using Newton- 
Cotes rules over 9 points, resulting in a scaling factor of 1023 instead of 15 (see (9)). 
Analogously to (10), the two quadrature rules are combined using Romberg extrap- 
olation to compute a 11th degree approximation which is used as the approximation 
to the integral 8 . 

The same approach, although effectively evaluated differently, was later re-used 



5 Note that McKeeman's original error estimate was off by a factor of 80 (see (7)). The factor of 
15 comes from using bisection instead of trisection. 

6 Interestingly enough, this was already suggested by Villars [1956] and implemented by Henriksson 
[1961], but apparently subsequently forgotten. 

7 In their paper, Malcolm and Simpson state (erroneously) that Lyness' SQUANK uses S^ 2 ) [a, 6] 
as its approximation to the integral and, as their results suggest, 5( 2 '[a, b] was also used in 
their implementation thereof. This omission, however, has no influence on their results or the 
conclusions they draw in their paper as they only consider the number of intervals generated by 
the global and local error estimates, and not the accuracy of the final result. 

8 This routine was integrated into MATLAB as quad8, albeit without the Romberg extrapolation, 
and has since been replaced by quadl as of Version 7, Release 14 [The Mathworks 2005]. 

ACM Computing Surveys, Vol. V, No. N, 20YY. 



ERROR ESTIMATION IN ADAPTIVE QUADRATURE • 7 



by Garribba et al. [1978] in 1978 in their integrator SNIFF for Gauss-Legcndrc 
quadrature rules. They do not use Romberg extrapolation to refine the approxi- 
mation of the integral, but the use the error estimate to guess the optimal width 
of the sub-intervals in each unconvcrgcd interval. 

Finally, in a 2001 paper, Gander and Gautschi [2001] present two recursive adap- 
tive quadrature routines. The first routine, adaptsim is quite similar to Lyness' 
SQUANK (see Section 2.1). It computes the approximations S^^a, b] and S^'fa,?)] 
and uses them to extrapolate NCg 1 ^ [a, b] as in (10). The globally relative local error 
estimate, however, is then computed as 



NCP[a k ,b k ]-SW[a k ,b k ] (11) 



where / is a rough approximation to the global integral computed over a set of 
random nodes. 



2.2 Finite-Difference Based Error Estimators 

In a 1967 paper, Gallaher [1967] presents a recursive adaptive quadrature routine 
based on the midpoint rule. In this algorithm, the interval is divided symmetri- 
cally into three sub-intervals with the width h c of the central sub-interval chosen 
randomly in h c e [\h k , \h k ] , h k = (b k - a k ). 

The integrand f(x) is evaluated at the center of each sub-interval and used to 
compute the midpoint rule therein. Since the error of the midpoint rule is propor- 
tional to the second derivative of f(x), the local integration error can be estimated 
by computing the second divided difference of /(x) over the three values f±, f% and 
/3 in the center of the sub-intervals. Instead of the difference formula, Gallaher 
uses the more compact approximation 

e = 14.6 \h 2/ 2 + h\ bk - a *- h < . ( i 2) 

In which the constant 14.6 is determined empirically. 

Similarly, Ninomiya [1980] presents a recursive adaptive quadrature routine based 
on closed Newton-Cotes rules. He uses rules with 2n + 1 nodes (results are given 
for 5, 7 and 9 points) and notes that these have an error of the form 

f b 

NCan-iM- / f(x)dx = K 2n+1 (b-a) 2n+1 f^(0, Ufab}. 

J a 

Instead of using the same quadrature rule on two or more sub-intervals to approx- 
imate the error as in Kuncir's and Lyness' error estimates, he adds two nodes in 
the center of the leftmost and the rightmost intervals. 

Using 5 + 2, 7 + 2 and 9 + 2 point stencils, he computes the error estimators, e.g. 

D 9+2 [ a , b] « 37(6 ~ a)U / (10) (O, £ G [a, 6], (13) 
J+ 1 ' J 3066102400- 7 v ' 

which approximate the scaled 2n + 1st derivative in the analytical error of the 
Newton- Cotes rules. 
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2.3 Coefficient-Based Error Estimators 

In 1969, O'Hara and Smith [1969] publish a recursive adaptive quadrature routine 
based on Clenshaw- Curtis quadrature rules [Clenshaw and Curtis I960]. Their 
algorithm uses a cascade of error estimates based on pairs of Newton-Cotes and 
Clenshaw-Curtis quadrature rules and the final error estimate is computed as 



32 



(6 2 - 9)(6 2 - 1) 



(14) 



where E" denotes a sum in which first and last terms are halved and where the 
flj and f r .i are the values of the integrand evaluated at the nodes of two 7-point 
Clenshaw-Curtis quadrature rules over the left and right halves of the interval 
respectively. These sums are the approximated Chebyshev coefficients eg of the 
integrand over the left and right half of the interval. 

The error estimate (14) is derived by O'Hara and Smith [1968] based on the error 
estimation used by Clenshaw and Curtis [I960]. They start by writing the error of 
a Clenshaw-Curtis quadrature rule over n + 1 nodes as 



/(xJdx-CC^M 



(b-a) 



16n 



-C n +2 + 



32n 



(n 2 -l)(n 2 -9) ^ (n 2 -9)(n 2 
where the c k are the exact Chebyshev coefficients of 



25) 



C„+4 + 



(15) 



f( x ) = ^2c k T k (x) 



fe=0 

where T k {x) is the fcth Chebyshev polynomial of the first kind. 

They note that for most regular functions, the first term in (15) is often larger 
than the sum of the following terms. 

They find that if they define the higher-order \c2i\, % > n + 1 in terms of |c„+2| 
using the recurrence relation \c%+2\ = ifn|ci | , then they can define K n for different 
n such that the first term of (15) dominates the series. For the 7-point Clenshaw- 
Curtis rule, this value is K§ = 0.12. If the relation |cj+2| < K n \ci\ holds, then the 
error is bounded by twice the first term of (15) 



f(x) dx 



CC%[a,b] 



<(b-a) 



32n 



(n 2 -l)(n 2 -9) 



\Cn+2\ 



However, we do not know c„ +2; yet since we assume that the magnitude of the 



coefficients decays, we can assume that |c„ + 2| < 



lie 

2 | ^ri 



and use 



Since 



\c n \ might be 11 accidentally smaW , they suggest, in [O'Hara and Smith 1968], as an 
error estimate 

£ = (b - a)— ^■max{|c„|,2^ Il |c Il _ 2 |,2i ; 5r 2 |c n _4|} • 



(16) 



'( n a-l)(n 2 -9) —~u-»i>—nr-»-* 

Oliver [1972] presents a similar doubly-adaptive Clenshaw-Curtis quadrature rou- 
tine using an extension of the error estimate of O'Hara and Smith (see Section 2.3) 
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Instead of assuming a constant K n such that |cj+2| < K n |cj| where the Cj are the 
Chcbyshev coefficients of the integrand, as do O'Hara and Smith, Oliver approxi- 
mates the smallest rate of decrease of the coefficients as 



K = max 



(17) 



where the c\ are the Chebyshev coefficients approximated over the nodes of the 
quadrature rule. 

He also pre-computes a number of convergence rates K n (cr), which are the rates 
of decay required such that, for n coefficients, a times the first term of the error 
expansion in (15) dominates the sum of the remaining terms. If K is less than any 
K n (a) for a = 2, 4, 8 or 16, then the error estimate 

e = <r{b-a)— — — max { K\c n \, K 2 \c n _ 2 \, AT 3 |c n _ 4 |} , (18) 

\n z — l)(n^ —9) 

which is consistent with (16) by O'Hara and Smith, is used. 

If e exceeds the required local tolerance r k , the computed rate of decrease K is 
compared to a pre-computed limit K*. This limit is defined by Oliver [1971] as the 
rate of decrease of the Chebyshev coefficients as of which it is preferable to subdivide 
the interval as opposed to doubling the order of the quadrature rule. Therefore, 
if K > if*, the interval is subdivided, otherwise the order of the Clenshaw-Curtis 
quadrature rule is doubled. 

Finally, Bcrntsen and Espclid [1991] present an error estimator based on se- 
quences of null rules. Introduced by Lyness [1965], a null rule of degree k is 

(k) 

defined as a set of weights u\ over the n nodes X{, i = 1 . . . n such that 

n f 
i—l K - 

i.e. the rule evaluates all polynomials of degree j < k to and the (fc+l) at monomial 
to some non-zero value. 

Bcrntsen and Espclid compute a sequence of orthonormaP null rules of decreasing 
degree n1™~ 2 \ which form an orthogonal basis S n - Applying 

the null rules to the integrand f(x) we obtain the interpolation coefficients ek = 
[a, b] = X)"=i u i k '' ' f( x i) °f the integrand f{x) onto S n such that 

. n — 1 

f(xi) = =* ^E 6 ^' i = l...n. (20) 

2^k=i w k k=0 



To avoid "phase effects" as described in [Lyness and Kaganove 1976], the coeffi- 
cients are then paired and the ratio of these pairs is computed 



r k = -^, E k = (e 2 2k + e 2 2k+1 ) 1/2 , fc = 0...n/2-l. (21) 



9 The null rules are normalized such that the norm of the coefficients is equal to the norm of the 
quadrature weights. 
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The largest of the last K ratios r max = max/j r k is taken as an estimate of the 
convergence rate of the coefficients. If this ratio is larger than 1 then the function 
is assumed to be ll non- asymptotic? in the interval and the largest E k is used as a 
local error estimate. 

If ?* max is below 1 yet still above some critical value r C riticai ; the function is assumed 
to be "weakly asymptotic? and the value of the next-highest coefficient E n /2+\ - 
and thus the local error — is estimated using 

Sk = 10r max £„/ 2 -i (22) 

Finally, if ?' max is below the critical ratio, then the function is assumed to be 
"strongly asymptotic" and the error is estimated using 

£fe = 10^7tical^axK/2-l. (23) 

where a > 1 is chosen to reflect, as Berntscn and Espelid state, "the degree of 
optimism we want to put into this algorithm? 

Berntsen and Espelid implement and test this error estimate using 21-point 
Gauss, Lobatto, Gauss-Kronrod and Clenshaw- Curtis quadrature rules as well as 
61-point Gauss and Gauss-Kronrod rules, and later in DQAINT [Espelid 1992], based 
on QUADPACK's QAG (see Section 2.4), using the Gauss, Gauss-Lobatto and Gauss- 
Kronrod rules over 21 nodes. This approach is then extended to Newton-Cotes rules 
of different degrees and tested against a number of different quadrature routines 
[Espelid 2002; 2003; 2004a; 2004b; 2007]. 

More recently, Balles and Trefethen [2004] and Pachon et al. [2009] use a similar 
approach in the Chebfun system, in which arbitrary functions are represented as 
single or piecewise interpolants over Chebyshev nodes. These interpolations are 
considered to be sufficiently accurate in each interval when the absolute values 
of the highest-degree coefficients drop below a given tolerance. The integral of 
these interpolants can then be computed using Clenshaw-Curtis quadrature over the 
interpolation nodes, resulting in an adaptive quadrature scheme of sorts, although 
this is not the only purpose of the Chebfun system. 

2.4 Gauss-Kronrod Based Error Estimators 

In 1973 both Patterson [1973] and Piessens [1973] publish adaptive quadrature 
routines based on Gauss quadrature rules and their Kronrod extensions [Kronrod 
1965]. 

Piessens' algorithm, which is the first to follow the scheme in Algorithm 2, uses 
an error estimate of the form 

e k = | G„ [a k , b k ] - K 2n +i [a k , b k ] | (24) 

where G„[a, b] is the n-point Gauss quadrature rule of degree 2n—l and K2 n +i[ct, b] 
is the 2n + 1 point Gauss-Kronrod extension of degree 3n + 1 which is in turn used 
as the approximation to the integral. This is also the error estimate currently used 
in Matlab's quadgk [Shampine 2008]. 

Patterson's integrator takes a different approach, starting with a 3-point Gauss 
quadrature rule and using the Kronrod scheme to successively extend it to 7, 15, 
31, 63, 127 and 255 nodes, resulting in quadrature rules of degree 5, 11, 23, 47, 95, 
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\G n [a,b]-K 2n +i[a,b]\ 



Fig. 1. The error measure (200 |G n [a, b] — K2 n +i[o.,b]\) 3 ' 2 (dashed line) plotted as a function of 
|G n [a, b] - K2n+l[a, 6]|- 

191 and 383 respectively, until the globally relative error estimate 

e k = \K n [a k ,b k } - K 2ti+1 [a fc ,6 fe ]|/|/| (25) 

where K„[a, b] is the Kronrod extension over n nodes and K2 ra +i[a, b] its extension 
over 2n + 1 nodes, is below the required tolerance. I is an initial approximation 
of the global integral generated by applying successive Kronrod extensions to the 
whole interval before subdividing. 

In 1983, the most widely-used "commercial strength" quadrature subroutine li- 
brary QUADPACK is published by Piessens et al. [1983]. The general adaptive 
quadrature subroutine QAG is an extension of Piessens' integrator, yet with a slight 
modification to the local error estimate 

e k = I k min j 1, ( 2 Q0 |G " [afc ' &fc] Z^S±^lM \ ** j (26 ) 



where the default value of n is 10 and the value 



h. 



K 2n+ i[a fe ,& fe ] 

fix) - 



bk - ak 



which is also evaluated using the K2 n +i[a, b] rule, is used, as described by Krommcr 
and Uberhubcr [1998], as "a measure for the smoothness of f on [a, &]". 

The error measure is best explained graphically, as is done in Piessens et al. (Fig. 1) 
The exponent | is determined experimentally and scales the error exponentially, 
with a break-even point at 1.25 x 10~ 6 which is approximately relative machine 
precision for IEEE 754 32-bit floating point arithmetic. The scaling makes the 
estimate increasingly pessimistic for error estimates larger than 1.25 x 10 -6 and 
increasingly optimistic for error estimates below that threshold. 
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This measure is further divided by y I k - Krommer and Uberhuber explain this 
as follows: 

"If this ratio is small, the difference between the two quadrature formulas 
is small compared to the variation of f on [<X,b]; i.e. , the discretization 
of f in the quadrature formulas G n and K2 n +i is fine with respect to its 
variation. In this case, K2„+i can indeed be expected to yield a better 
approximation for If than G n ." 

Unfortunately, no further analysis is given in either [Piessens et al. 1983] or [Krom- 
mer and Uberhuber 1998]. 

This local error estimate is re-used by Favati et al. [1991a], yet using pairs of 
"recursive monotone stable" (RMS) nested quadrature rules introduced by Favati 
et al. [1991b], which allow for function evaluations to be re-used after bisection, 
within a doubly- adaptive scheme. 

Hascgawa et al. [2007] extend this approach by choosing bisection over increasing 
the degree of the quadrature rule when the ratio of two successive error estimates 
is larger than an empirically determined constant (as is suggested by Venter and 
Laurie [2002], see Section 3.2). 

In 1984 Berntsen and Espclid [1984] suggest that instead of using the difference 
between a Gauss quadrature rule over n points and its Kronrod extension over 
2n+ 1 points, one could directly use a Gauss quadrature rule over 2n+ 1 points for 
the estimate of the integral. To estimate the error of this rule of degree 4n+ 1, they 
suggest removing one of the points and creating a new interpolatory quadrature 
rule Q2n[a, 6] of degree 2n — 1 over the remaining 2n points: 

£fe = | G 2n+ i [a k , b k ] - Ghn [a k ,h]\- (27) 

Since the degree of the rule Ghn [a, b] is the same as that of the Gauss quadrature 
rule G n [a, b] used by Piessens (see Section 2.4), the error estimate is for functions 
of up to the same algebraic degree of precision, yet the final estimate is n degrees 
higher: An + 1 for G2n+i[a,b] vs. 3n + 1 for r\2n+i[a, b\. A further advantage is 
the relative ease with which the weights of the rule Q2n[a, b] can be computed, as 
opposed to the effort required for the nodes and weights of the Kronrod extension. 

Finally, the second routine by Gander and Gautschi [2001] (see Section 2.1), 
adaptlob, uses a 4-point Gauss-Lobatto rule GL^'fa, b] and its 7-point Kronrod 
extension [a, b]. The globally relative local error is computed, analogously to 
(11), as 

e k = \GL^[a k ,b k ] - K^[a k ,b k ]\/\I\. (28) 

If the tolerance is met, the approximation K^[a, b] is used for the integral. 
2.5 Summary 

Summarizing, we can group the different linear error estimators in the following 
categories: 

(1) e~ Q^Ml-Qn^'M] : Er ror estimators based on the difference between 
two estimates of the same degree yet of different multiplicity [Kuncir 1962; 
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McKeeman 1962; McKccman and Tesler 1963; McKecman 1963; Lyness 1969; 
1970; Malcolm and Simpson 1975; Forsythe et al. 1977]. 

(2) e ~ |Q ni [a, b] — Q„ 2 [a, &]|: Error estimators based on the difference between 
two estimates of different degree [Patterson 1973; Piessens 1973; Piessens et al. 
1983; Hascgawa et al. 2007; Berntsen and Espelid 1984; Favati et al. 1991a; 
Gander and Gautschi 2001; O'Hara and Smith 1969]. 

(3) e ~ |/( Error estimators based on directly approximating the derivative 
in the analytic error term [Gallahcr 1967; Garribba et al. 1978; Ninomiya 1980]. 

(4) e ~ |c„|: Error estimators based on the estimate of the highest-degree coefficient 
of the function relative to some orthogonal base [O'Hara and Smith 1968; 1969; 
Oliver 1972; Berntsen and Espelid 1991; Espelid 1992; 2002; 2004a; 2004b; 
2007]. 

Already in 1985, Laurie [1985] shows that the first three categories are, in essence, 
identical. Consider Kuncir's error estimate (see Section 2.1, (3)) from the first 
category (without the relative scaling), which can be viewed as a 5-point "rule" (or 
linear functional) over the nodes used by S^^a, 6] and S^[a,b]. 

Since both approximations evaluate polynomials of up to degree 3 exactly, their 
difference will be, when applied to polynomials of up to degree 3, zero. When 
applied to a polynomial of degree 4 or higher, the estimates will, in all but patho- 
logical cases, differ. This is, up to a constant factor, exactly what the Ath divided 
difference over the same 5 nodes computes 10 . 

The same can be said of error estimates from the second category, such as the 
one used by Piessens (see Section 2.4) where the Gauss quadrature rule G n [a, b] 
integrates all polynomials of degree up to 2n — 1 exactly and its Kronrod extension 
K2n+i[a, b] integrates all polynomials of degree up to 3n + 1 exactly. Since the 
approximation computed by these rules differ only for polynomials of degree 2n 
and higher, the combined "rule" over the 2n + 1 points behaves just as the 2nth 
divided difference would. 

In these cases, the divided differences are unique 11 (i.e. the nth difference over 
n+ 1 points), as are the quadrature rules. They therefore differ only by a constant 
factor. As a consequence, the first and second categories, are both equivalent to 
the third category, in which the lowest degree derivative of the error expansion are 
approximated explicitly. 

In the fourth and final category we again find finite differences, namely in 
Berntsen and Espelid's null rules (see Section 2.3), in which the coefficients 
relative to an orthogonal base are computed (see (20)). The highest-degree coeffi- 
cient e„_i, computed with the (n — l) st null rule over n nodes is, as Berntsen and 



10 Note that this is also, up to a constant factor, the definition of a null-rule, as used by Berntsen 
and Espelid (see Section 2.3). Lyness [1965], who originally introduced the concept of null-rules, 
creates them explicitly from the difference of two quadrature rules, as is done in these error 
estimates implicitly. 

11 Not all error estimators in these categories, though, are identical up to a constant factor to 
the highest-degree divided differences over the same points. McKceman's error estimator (see 
Section 2.1), for instance, approximates a 4th divided difference over 7 points, which is neither 
unique nor of the highest-possible degree. The same can be said of Forsythe, Malcolm and Moler's 
QUANC8 (see Section 2.1) and Patterson's successive Kronrod extensions (see Section 2.4). 
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Espclid themselves note in [Berntsen and Espelid 1991], identical up to a constant 
factor to the (n — l) st divided difference over the same nodes. This value is com- 
bined with the (n — 2) nd divided difference (see (21)), itself identical only up to a 
linear factor and used as an error estimate. 

The same goes for the coefficients relative to any base computed over n points, 
such as the coefficients c; of the Chebyshev polynomials used by O'Hara and Smith 
(see Section 2.3) and Oliver (see Section 2.3). The "rule" used to compute the 
highest-degree coefficients ((14)) is identical up to a constant factor to the nth 
divided difference over the n + 1 nodes used. While O'Hara and Smith use the 
highest-degree coefficient directly, Oliver uses K 3 |c„_4| (see (17) and (18)), which 
is related (i.e. no longer identical up to a constant factor) to the (n — 4)th divided 
difference. 

We therefore establish that all linear error estimators presented in this section 
are equivalent in that they all use one or more divided difference approximations of 
the higher derivatives of the integrand. The quality of the error estimate therefore 
depends on the quality of these approximations. 

In these estimates, problems may occur when the difference between two esti- 
mates or the magnitude of the computed coefficients is accidentally small i.e. the 
approximations used to compute the error estimate are too imprecise, resulting in 
a false small error estimate. This is often the case near singularities and disconti- 
nuities where the assumptions on which the error estimate is based, e.g. continuity 
and/or smoothness, do not hold. 

3. NON-LINEAR ERROR ESTIMATORS 

In the previous section, we considered error estimators that used only linear com- 
binations of function values inside a single interval. In this section, we will consider 
methods that use function values or quadratures from one or more intervals or sub- 
intervals and which combine these values non-linearly to estimate the integration 
error. 

3.1 De Boor's CADRE Error Estimator 

In 1971, dc Boor [1971] publishes the integration subroutine CADRE. The algorithm, 
which follows the scheme in Algorithm 1, generates a Romberg T-table [Bauer et al. 
1963] with 

Ti t = T^_! + Tt ' i ~ 1 ~ Tt^zl (29) 
4' — 1 

in every interval. The entries in the T-table are used to decide whether to extend 
the table or bisect the interval 12 . After adding each £t\i row to the table, a decision 
is made using the ratios 

Ri = (30) 

as to whether the integrand is linear, sufficiently smooth, discontinuous, singular 
or noisy inside the interval. 



Thus making it the first doubly-adaptive quadrature algorithm known to the author. 
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If the integrand is assumed to be smooth (Rq = 4±0.15), the approximation Ti 
is returned for the smallest i < £ such that the error 



£k = (bk - a k ) 



1 



(31) 



is less than the required local tolerance. Otherwise, if a jump discontinuity is 
assumed (R = 2 ± 0.01), the error is assumed to be bounded by the absolute 
difference of the two previous lowest-degree estimates: 



IT, 



LO 



Tt 



£—1,01 



Finally, if the integrand is assumed to be singular (Rq £ (1,4) and is within 10% of 
the Ro from the previous level I — 1) and of the form f(x) = (x — £ l ) a g(x), where 
£ is near the edges of [afe,6fe] and a £ ( — 1, 1). If this is the case, Ro should be 
w 2 a+l and the T-Table is computed using 11 cautious extrapolation!'' by interleaving 
the normal updates in (29) with updates of the form 



TiA-l 



T(a-i — Tg-n-i 



(32) 



2 a+l - 1 

where necessary. The error estimate is computed as in the smooth case (31) or as 



£k = (bk - a k ) 



2 a+i - 1 



(33) 



depending on which column i is considered. 

The rationale for using the ratios Ri (30) is based on the observation that the 
error of each entry of the T-table is, for sufficiently smooth integrands, 



1 



b — a 

The ratio Ri can therefore be re-written as 



f(x)dx-T e ^K t (2-^y 



, 2i+2 



Ri 



( 2 -(e-2-i) 



2i+2 



22i+2 



l2i+2 



ii+1 



Ki (2-(^')) 2l+2 - m (2-(^-i-0) 2i+2 



1 



i2i+2 



(34) 



(35) 



If this condition is actually satisfied (more or less), then de Boor considers it safe 
to assume that the difference between the two approximations T^j_x and T^j is a 
good bound for the error of Te,i, as is computed in (31). 

This error estimate for the regular case is itself, as defined at the beginning of 
this section, by no means non-linear. The reason for its inclusion in this category 
is the special treatment of integrable singularities in (33). 

3.2 Rowland and Varol's Modified Exit Procedure 

In 1972, Rowland and Varol [1972] publish an error estimator based on Simpson's 
compound rule. In their paper, they show that the "stopping inequality" 



S (m) [a,6]-S (2m) [a,6] 



> 



,6 

S (2m) [a,&] - / f(x)dx 

J a 
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is valid if f^(x) is of constant sign for x <E [a,b]. They also show that under 
certain conditions there exists an integer too such that the inequality is valid for all 
to > too. 

They note that for the compound Simpson's rule 

S( 2m )[a,fe]-S( 4m )[a,&] ~ 1 j 

holds, where usually q = 2. This condition is used to test if m is indeed large enough, 
much in the same way as de Boor's CADRE does (see (30)) to test for regularity. If 
this condition is more or less satisfied 13 for any given to, then they suggest using 

(S^[a k ,b k ]-S^[a k ,b k }) 2 
£k |S(-)[a fe ,6 fe ] -S(2m)[ afe?6fe ]| • V<> 

This error estimate can be interpreted as follows: Let us assume that 

e m = S (m) [a,6]-S (2m) [a,6] (38) 

is an estimate of the error of S^ m ^[a, b]. If we assume that the error estimates 
decrease at a constant rate r when the multiplicity to is doubled, then we can 
extrapolate the error of S*- 4m ' [a, b] using 

_ _ e 2m _ _ e 2m 

which is exactly what is computed in (37). 

A similar approach is taken by Venter and Laurie [2002] , where instead of using 
compound Simpson's rules of increasing multiplicity, they use a sequence of strati- 
fied quadrature rules, described by Laurie [1992]. In their algorithm, the sequence 
of quadratures of increasing degree Qi[et, b], Qz[a, 6], Qr[a, b], . . . , Q2i-i[a, b] is com- 
puted and the differences of pairs of these rules are used to extrapolate the error 
of the highest-order (ith) quadrature rule: 

£ fc = |Ms Ei = \Q*-i[a,b]-Q* + i_ 1 [a,b]\. (39) 
3.3 Laurie's Sharper Error Estimate 

In 1983, Laurie [1983] publishes a sharper error estimate based on two quadrature 
rules Q Q [a, b] and Qp[a, b] of degree a and ft respectively, where a > ft, or a = ft 
and Q a [a, b] is assumed to be more precise than Q^[a, b]: 



(q< 2) - Q< 2) ) (Qi 2) - Q 



in 



where the ranges [a k , bk] are omitted for simplicity. 



13 Since their paper does not include an implementation, no specification is given to how close to 
a power of two this ratio has to be. 
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Fig. 2. The error of the quadrature rules qL*™' (solid curve) and Q^ m ' (dotted curve) as a function 
of the number of panels or subdivisions m. 



He shows that this error estimate is valid when 

QL 2) 



qL 2) -q 



in 



< 



and < 



qL 1} 



/ Q?-i 
< -L — < i. 



(41) 



The former can be checked for in practice, yet the latter is impossible to verify since 
the exact integral / must be known. These two conditions imply that the error of 
Q„ [a. b] is smaller than and decreases at a faster rate than that of Qpla, b]. 

Laurie suggests a weaker condition that can be checked in practice, namely re- 

(2) 

placing / by Q a > [a k , b k ] + e k in (41), resulting in 



< 



Qi 2) - Q 



(i) 



< l. 



(42) 



Espelid and S0revik [1989] show, however, that this weaker condition is often sat- 
isfied when (41) is not, which can lead to bad error estimates 14 . 

The error estimate itself, based on these assumptions, is best explained graphi- 
cally (see Fig. 2). The errors of both rules Qa" 1 [a, b] and Ql m ^[a, b] are assumed to 
decrease exponentially with the increasing number of panels or subdivisions m: 

b — a x 



I = K 



m 

cfem, d n and d. 



n using 



ee [a,b\. 



We define the distances 

Qg m) -I = e, Qf m) -I = e + d 2m , 
Qi m) - I = e + da, Q { f " l) -I = e + d a + dm. 
Inserting these terms into the second inequality in (41), we obtain 

e , E + C?2m . . d a d2m 



< 



e < 



(43) 



(44) 



s + d a e + d a + d m dim — d m 

Resolving the distances using (43), we see that this bound is identical to the error 
estimate proposed by Laurie (40). 



14 Espelid and S0revik show that this is the case when using the 10-point Gauss rule and its 21- 
point Kronrod extension for Q^ 1 ' and G)L respectively and integrating fj 2 0.1/(0.01 + (x — A) 2 ) dx 
for 1 < A < 2. 
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In 1991, Favati et al. [1991b] publish a similar error estimator, based on four 
quadratures Q a [a,b], Qp[a, b], Q 7 [a, b] and Qs[a, b] of degree a > /? > 7 > 6 that 
satisfy the relations 

|/-Qa| < l-T-Qjl, |-f-Qa| < |/-Q 7 |, 

l'-<W * l'-<W. ^<-^y W 

For any ordering of the four estimates Q a , Q^, Q 7 and Qg around the exact 
integral /, we can define the distances d a = |Q a — I\, dp, d 1 and ds depending on the 
configuration of the estimates around I, similarly to (43). The algorithm therefore 
first makes a decision as to which configuration is actually correct based on the 
differences between the actual estimates. Based on this decision, it computes the 
d a , dp, d 1 and ds or bounds them using the first three relations in (45) and inserts 
them into the final relation in (45) to extract an upper bound for d a = \I — Q a \. 

Favati et al. test this algorithm on a number of integrands and show that the 
milder conditions in (45), which do not require that successive estimates decrease 
monotonically, arc satisfied more often than those of Laurie in (41). 

3.4 De Doncker's Adaptive Extrapolation Algorithm 

The probably best-known quadrature algorithm using non-linear extrapolation is 
published by de Donckcr [1978]. The main idea of the algorithm is similar to that 
of the Romberg scheme: Given a basic quadrature rule Q„[a, 6], the series 

Q« [a, 6] , Ql 2 > [a, b] , <#> [a, b] , . . . , Q<*> [a, b},... (46) 
converges exponentially, for large enough i and sufficiently smooth f(x), towards 

In Romberg's scheme, Qn [a, b] = T^ 1 ™) [a, b] , is the trapezoidal rule, and the limit 
of the series is extrapolated linearly using the the Romberg T-table. De Doncker's 
algorithm, however, differs in two main points: The 21-point Gauss-Kronrod rule 
is used as the basic rule Q!"^ [a, b] instead of the trapezoidal rule and the non-linear 
e-Algorithm [Wynn 1956] is used to extrapolate the limit of the series instead of 
the linear extrapolation in the Romberg T-tablc. 

The algorithm, as described thus far, is not yet adaptive. The main (and new) 
trick is that instead of using Q„ [a, b], de Doncker uses approximations Qi m -*[a, b]. 
Each approximation Qn" 1 '' [a, b] is computed by iteratively picking out the sub- 
interval of width greater than h = (b — a)/m with the largest local error estimate 

ek = \G w [ak,b k \-K 21 [a k ,b k ]\ (47) 

which is the same local error estimate as first used by Piessens (see Section 2.4), 
and subdividing it until cither the sum of the local error estimates Sk of all intervals 
of width larger than h is smaller than the required tolerance or there are no more 
intervals of width larger than h left to subdivide. 

In her original paper, de Doncker does not give any details on how the e- Algorithm 
is applied or how the global error is estimated. In its implementation as QAGS in 
QUAD PACK, the local error estimate (47) is replaced by the local error estimator 
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used in the other QUADPACK-routines (see Section 2.4, (26)). A global error 
estimate is computed for the extrapolated Ii using 

e t = \It - + \U - + \h - Ii_ z \ (48) 

where ij_2 and 7j_3 are the previous three estimates of the global integral. 

3.5 Summary 

Although most of the non-linear error estimators presented in this section differ 
significantly in their approach, they all rely on the same basic principle, namely the 
assumption that, for any quadrature rule Q( m ) [a, 6], for sufficiently smooth f(x) in 
the interval x € [a, 6], the error can be written as 

Q (m) [a,6]- / f{x)dx^ K h a , h=—^ (49) 
Ja m 

where re depends on the basic quadrature rule Q and the higher derivatives of the 

integrand and a is the order of the error. In the most general case, (49) has three 

unknowns, namely the actual integral I = f(x) dx, the scaling re and the order 

a of the error. The order a is usually assumed to be the order of the quadrature 

rule, but in the presence of singularities or discontinuities, this is not always the 

case. The three unknowns may be resolved using three successive approximations 

of increasing multiplicity: 

Q (m) = I + K h a (50) 

Q (2m) = I + K h a 2- a (51) 

Q (4m) = i + K h a 4r a (52) 
We can subtract (50) from (51) to isolate the error term 

n( m ) - (2m) 2" (OH _ Q(2m)\ 

K h a = r! r! — = — ri — L (53) 

1 - 2~ a 2 a - 1 

Re-inserting this expression into (51), we obtain 

/ = Q 



(2m) _ 22 !j 



2" - 1 

which is the linear extrapolation used in the Romberg T-table (for even integer 
values of a) and also used by de Boor's CADRE (see Section 3.1, (32)), where the 
Q(m) j Q(2m) and Q(4m) are the T-table entries Tt- 2 ,i, T^_ M and T Li respectively, 
for an unknown a. 

Inserting (53) into (51) and (52) and taking the difference of the two, we can 
extract 

Q(2m) _ Q(4m) 

Q('") - Q( 2 '») [ ' 

which is the ratio Ri used by de Boor ((30)) to approximate the order of the error 
(2" +1 therein). 

Inserting both (53) and (54) into the last estimate, (52), we obtain 

, (Q( 2 ™) -Q( 4 ™)) 2 

W Q(m) _ 2 Q(2m) + Q(4m) > 

ACM Computing Surveys, Vol. V. No. N, 20YY. 



20 • P. GONNET 



which is one step of the well-known Aitkcn A 2 -process [Aitkcn 1926]. 

The approach taken by Rowland and Varol (see Section 3.2) is almost identical, 
except that, instead of using the exact integral, they use 

Q(m) = Q(2m) +K/l ^ g(2m) = g(4m) + , Q( 4m ) = J + K h a 4T a (56) 

to solve for K,h a , 2~ a and the exact integral /, resulting in their simpler error 
estimate (see (37)). 

In a similar vein, Laurie (see Section 3.3) uses the four equations 

Q ( a 1] =I + Ka (b~a) a + 2 , Qi 2) ^ I + Ka (b - a) a+2 2-^\ 

Qf'=/ + ^(i-«r, Q<?> = I + np(b- a)^+ 2 2-W+ 2 ) [ ' 

which are, however, under-determined, since there are 5 unknowns (n a , ftg, a, /3 
and /). To get a bound on the equation, Laurie therefore adds the conditions in 
(41), obtaining the inequality in (44) from which he constructs his error estimate. 
Similarly, Favati, Lotti and Romani use the equations 

Q a = I + n a (b - a) a + 2 , Q fj =I + K P (b - af+ 2 , 
Q 7 = I + K 7 (b - a) 7+2 , Qs = I + K S (b - a) 5 + 2 , 

which have 8 unknowns, and which can be solved together with the four conditions 
in (45). 

Laurie and Venter's error estimator (see Section 3.2), differs in that, although 
similar in form to that of Rowland and Varol, the estimates 

Qi 1} =I + Ki{b- a) 3 , =1+ K 3 (b - a) 5 , . . . , Qg\ =1 + n 255 (b - a) 257 

form a set of n equations in n + 1 unknowns (/ and the n different «j, assuming, for 
simplicity, that the actual order of the error is that of the quadrature rule) which 
can not be solved as above. 

In summary, these methods, i.e. Romberg's method, the Aitken A 2 -process and 
Rowland and Varol's extrapolation, take a sequence of initial estimates Q( m ) , Q( 2 ™) , 
Q( 4to ), . . . and use them to create a sequence of improved estimates by removing 
the dominant error term as per (49). These approaches can, of course, be re-applied 
to the resulting sequence, thus eliminating the next dominant error term, and so 
on. This is exactly what is done in the columns of the Romberg T-table and in 
successive re-applications of the Aitken A 2 -process. 

Instead of successively and iteratively removing the dominant term in the error, 
we could also simply model the error directly as the sum of several powers 

Q^-I^n 1 h ai +K 2 h a2 +--- + K N h aN , h= h —^ (58) 

m 

Since this equation has 2A^+ 1 unknowns (the N constants Ki, the N exponents cti 
and the exact integral /), we need 2A^ + 1 estimates to solve for them: 

Q (m) = / + mh ai + K2 h a2 + ■■■ + K N h aN 

Q(2m) = j + + K2 / i «2 2 -«2 + . . . + K N h aN 2~ aN 



Q(2 2N m) = j + Klh ^i 2 - 2n ^ + K2 h a2 2- 2na2 +■■■+ K N h a »2- 2NaN (59) 
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This non-linear system of equations does not appear to be an easy thing to solve, yet 
in [Kahaner 1972] Kahaner shows that, if we are only interested in /, this is exactly 
what the e- Algorithm [Wynn 1956] docs. For an even number of approximations 
2N, the algorithm computes the same approximation as in (58), yet only over the 
first N — 1 terms, ignoring the first estimate Q( m ). 

Keeping (58) in mind, dc Doncker's error estimate (see Section 3.4, (48)) then 
reduces to 

e l ^2\n N h a "\ 

for N = [i/2\ , assuming that, ideally, for all estimates the right-most even column 
of the epsilon-table was used. 

Generally speaking, we can say that all the error estimators presented herein 
assume that the error of a quadrature rule Q^ n '[a, b] behaves as in (58). The 
unknowns in this equation (/ = J f(x)dx, Kj and o<j) can be solved for using 

several approximations Qn . 

In all these methods, the error estimate is taken to be the difference between the 
last estimate and the extrapolated value I of the integral. In the case of de Boor's 
CADRE, this is the difference between the last two entries in the bottom row of the 
modified T-table, and for Rowland and Varol (Section 3.2), Laurie (Section 3.3) 
and Favati, Lotti and Romani (Section 3.3), this is Q( 4m ) — 7, Q^ 2 -* — I and Q a — I 
respectively 

If the exponents Oj are known or assumed to be known, the resulting system is 
a linear system of equations. This is what Romberg's method does quite explicitly 
and what many of the error estimators in Section 2 do implicitly. If the exponents 
ctj are not known, the resulting system of equations is non-linear and can therefore 
only be solved for non-linearly. 

The non-linear methods discussed here are therefore a conceptual extension of 
the linear error estimators presented earlier. As such, they are subject to the same 
problem of the difference between two estimates being accidentally small in cases 
where the assumptions in (49) or (58) do not actually hold, as is the case for 
singular or discontinuous integrands. The different error estimation techniques in 
this section differ only in the depth N of the expansion and the use of additional 
constraints when the resulting system of equations is under-determined. 

4. A NEW ERROR ESTIMATOR 

In the following, we will present a new type of error estimator introduced by the 
author in [Gonnet 2010]. For the construction of this error estimator, we will begin 
with an explicit representation of the integrand. In almost all previously presented 
error estimators, the integrand itself is represented only by its approximated integral 
or, in the best of cases (see Section 2.3), only a few higher-order coefficients relative 
to some base. 

By definition, every intcrpolatory quadrature rule implicitly constructs an inter- 
polation polynomial g n {x) of degree n — 1 of the integrand f(x) at the nodes Xi, 
i = 1 . . . n and computes the integral of the interpolation. This equivalence is easily 
demonstrated, as is done in many textbooks in numerical analysis ([Stiefcl 1961; 
Rutishauser 1976; Gautschi 1997; Schwarz 1997; Ralston and Rabinowitz 1978] to 
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name a few) 15 . 

For our new error estimate, we will represent the interpolant g n {x) explicitly as 
a weighted sum of orthonormal Legcndre polynomials 

n-l 

g n (x) = c k p k (x). (60) 

The interpolant g n {x) interpolates the integrand f(x) on the transformed interval 
from [a, b] to [—1, 1] at the nodes Xi, i = 1 . . . n: 

9n{Xi) = f(xi) = f ( - ~~o~ x i \ ' x * G X ]- ( 61 ) 



Given the function values f = (f(xi),f(x2),.-.,f(x n )) T , we can compute the 
vector of coefficients c = (co , ci , . . . , c„_ i ) T by solving the linear system of equations 

Pc = f (62) 

where the matrix P with Pij = Pj(xi) on the left-hand side is a Vandermonde- 
like matrix. The naive solution using Gaussian elimination is somewhat costly and 
may be unstable [Gautschi 1975]. However, several algorithms exist to solve this 
problem stably in 0(n 2 ) operations for orthogonal polynomials satisfying a three- 
term recurrence relation [Bjorck and Pereyra 1970; Higham 1988; 1990; Gonnct 
2009b]. 

Given such a representation as in (60), the integral of g n {x) can be computed as 

/l n — 1 n — 1 

g n (x)dx = J^c fc / p k {x)dx = ^ c k uj k = u) T c. (63) 
_1 k=0 fe=0 

Using orthonormal Legendre polynomials, the coefficients arc simply uj t = (l/v2, 0, . 
We can formulate the integral approximation as the scalar product of the vector of 
coefficients c with a vector of weights u>: 

Q n [a, ] = ^_^u; T c. (64) 

Another useful feature of such a representation is that it can be easily transformed 
to a sub-interval. Let Cj,i = 0...n — lbe the coefficients of the interpolation g n (x) 
in the interval [a, b]. Given the matrix T^' with entries 



TS = I Pj(*)Pi ( ^ ) ^ (65) 



i 



x- 1 



2 



we can compute the coefficients cf \ i = . . .n — 1 of the interpolation g„\x) over 
the left half of the interval [a, (a + b)/2] using e^> = T^c where the resulting 



15 If we consider the Lagrange interpolation g n (x) of the integrand and integrate it, we obtain 

/b rb n 11 rb n 

g n (x)dx = ^2£i(x)f(xi)dx = ^2f(xi) tj{x)dx = y^J(x i )w l 
Ja i=o i=o Ja i=0 

where the £i(x) are the Lagrange polynomials and the w, are the weights of the resulting quadra- 
ture rule. 
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polynomial gn\x) over [—1,1] is identical to g n (x) over [—1,0] (gn'(x) — g n (^j^), 

xe [-1,1]). 

Analogously, we can create the matrix T( r ) such that = T^c are the co- 
efficients of the right half of g n {x) transformed to [—1,1]. Such upper-triangular 
matrices can be constructed to transform g n {x) to any sub- interval. 

A final useful feature is that given the coefficients Ci , i = ... n — 1 of any 
interpolation g n {x), we can compute its L2-norm using Parseval's theorem: 

1/2 



g(xf dx 



1/2 



,i=0 



C 2 



(66) 



which is simply the Euclidean norm of the vector of coefficients c. In the following, 
we will use || • || to denote the 2- norm. 

Instead of constructing our error estimate by approximating the difference of the 
integral of the interpolation g n (x) to the integral of the integrand f(x) directly, as 
is done in practically all the methods presented in Section 2 and Section 3, we will 
consider the L2-norm of the difference between the integrand and its interpolant: 

1/2 



f( x )-9n(x)) dx 



(67) 



The proposed error estimate in (67) is, save for a constant factor of y/2, an upper 
bound of the integration error of the interpolant g n (x) w 



QnhM] 



f(x)dx 



(dn{x) ~ f(x))dx 



and will only be zero if the interpolated integrand matches the integrand on the 
entire interval (g n (x) = fix), x £ [—1,1]). In such a case, the integral will also be 
computed exactly. The error (67) is therefore, assuming we can evaluate it reliably, 
not susceptible to "accidentally small" values. 

Since wc do not have an exact representation of the integrand f(x), we can not 
compute (67) exactly. We can, however, generate a first trivial error estimate using 
two interpolations gni(x) and gn^ix) °f different degree where 712 > n\. If wc 
assume that gn}(x) interpolates the integrand f(x) much better than does gn}{x), 
then wc can assume that 



f(x)-gW(x)*gW(x)-gW(x) 



(68) 



16 This can be shown using the Cauchy-Schwarz inequality 

2 r i 



/i r 1 r 1 

<f>(x)ip(x) dx < J \4>(x)\ 2 dx J \tp(x)\ 2 dx. 



for ?p(x) = 1 wc obtain 



and finally 



p^{x)dx < 2j 1 ^\4>(x)\ 2 



dx, 



J 4>{x)dx <V2^J \4>(x)\ 2 dx^j 



1/2 
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that is, that f(x) on the left-hand side can be replaced with g n2 (x), similarly to 
Piessens' and Patterson's error estimates (see Section 2.4), in which the estimate 
from a higher-degree rule is used to estimate the error of a lower-degree rule. Taking 
the L2-norm from the left-hand side of (68), we obtain 



£i = 



b — a, 



(69) 



where c^ 1 ) and are the vectors containing the coefficients of the interpolants 
gn}(x) and g^ 2 \x) respectively and = where i>n\. 

This error estimate, however, is only valid for the lower-degree interpolation 
gni {x) and would over-estimate the error of the higher-degree interpolation g„] (x) 
which we would use to compute the integral. For a more refined error estimate, we 
could consider the interpolation error 



f(x) - g n (x) = i—^Zl^x), & g i] 



(70) 



for any n times continuously differentiable f(x) where £ x depends on x and where 
Tr n (x) = nr=i( a; — x i) ls * ne Newton polynomial over the n nodes of the quadrature 
rule: 

Taking the L 2 -norm on both sides of (70) we obtain 



9n{x) - f{x)\ dx 



1/2 



n 2 n {x) dx 



1/2 



Since 7r^(x) is, by definition, positive for any x, we can apply the mean value 
theorem of integration and extract the derivative resulting in 



\9n{x) 
1 v 



/(*), 



) 2 dx 


1/2 


f w (0 




J nl(x)dx 


1/2 






n\ 









€e[-l,l]. 



(71) 

If we represent the polynomial n n (x) analogously to g n (x), as 7r„(x) = J2k=o bkPk{x) 
then we can compute its L2-norm as ||b||, where b is the vector of the n + 1 co- 
efficients 17 6fe. Therefore, the terms on the right-hand side of (71), only the nth 
derivative of the integrand is unknown. 

Given two interpolations of the integrand, gn(x) and gn\x), of the same degree 
yet not over the same set of nodes, if we assume that the derivative / ( -™- ) (^) is 
constant for £ g [a, b} 18 , we can extract the unknown derivative as follows: 



9 { n ] {x) 



(x) 



/(«)(£) 



r (2) 



(X) 



(72) 



17 Higham [1988] shows how the coefficients of a Newton-like polynomial can be computed relative 
to any orthogonal base. 

18 This assumption is a stronger form of the "sufficiently smooth" condition, which we will use 
only to construct the error estimator. 
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where iTn(x) and 7T„ (x) are the nth Newton polynomials over the nodes of gn\x) 
and g„ ' (x) respectively. Taking the L 2 -norm on both sides of (72), we obtain 



/»(£) 



42)1 



b« -b( 2 ) 



(73) 



from which we can construct an error estimate for either interpolation 



1/2 



.(1) 



42)1 



bW-bW 



■||b«||, fee {1,2}. (74) 



Note that for this estimate, we have made the assumption that the nth derivative 



is constant. We can't verify this directly, but we can verify if our computed 
(73) actually satisfies (70) for the nodes of the first interpolation by testing 



g^ix^-fixi) 



<0i 



/(«)(£) 



where the Xi are the nodes of the interpolation g^ (x) and the value i?i > 1 is an 
arbitrary relaxation parameter. If this condition is violated for any of the Xi, then 
we use the un-scaled estimate as in (69). 

In practice, we can implement this error estimator in a recursive adaptive quadra- 
ture by first computing the n coefficients Ck of g n (x) in the interval [a, b}. The n + 1 
coefficients bk of the nth Newton polynomial over the nodes of the basic quadrature 
rule can be pre-computed. 

For the first interval, no error estimate is computed. The interval is bisected and 
for the recursion on the left half of [a, b], we compute 19 

c oid = T (£) c ^ b oid = 2 "T w b. 

Inside the left sub-interval [a, (a + b)/2], we then evaluate the new coefficients c. 
Given the old and new coefficients, we then compute the error estimate 

(b-a) l|c-c° ld || 
£2= 2 ||b-b°"H l|b|1 - (76) 



1 



(75) 



5. COMPARISON 

In the following, we will compare the performance of some of the error estima- 
tion techniques presented in §Section 2 and 3, including the new error estimator 
presented in Section 4. 

5.1 Methodology 

Whereas other authors [Casalctto et al. 1969; Hillstrom 1970; Kahaner 1971; Mal- 
colm and Simpson 1975; Robinson 1979; Krommer and Uberhuber 1998; Favati 
et al. 1991a] have focused on comparing different algorithms as a whole, using sets 
of functions chosen to best represent typical integrands, we will focus here only on 



19 Note that to compute b old we would actually need to extend TW and, since b old and b are not 
in the same interval, we have to scale the coefficients of b old by 2 n so that Equation 70 holds for 
9t?^ i x ) m the sub-interval. 
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specific error estimators and on integrands chosen such that they specifically should 
or should not cause the error estimator to fail. 

For these test functions we will not consider the usual metrics of efficiency, 
i.e. number of function evaluations required for a given accuracy, but the number of 
correct estimates, false negatives and false positives for each error estimator when 
integrating functions which it should or should not integrate correctly, respectively. 

We define a false positive as a returned error estimate which is below the required 
tolerance when the actual error is above the later. Likewise, a false negative is a 
returned error estimate which is above the required tolerance when the actual error 
is below the later. 

In practical terms, false negatives are a sign that the error estimator is overly cau- 
tious and continues to refine an interval even though the required tolerance would 
already have been achieved. False positives, however, may cause the algorithm to 
fail completely: if the actual error in a sub-interval is larger than the global toler- 
ance, no amount of excess precision in the other intervals will fix it and the result 
will be incorrect, save an identical false positive elsewhere of opposite sign. 

The test integrands, along with an explanation of why they were chosen, are: 

(1) p n (x): The Chebyshev polynomial of degree n in the interval [—a, /3], where a 
and (3 are chosen randomly in (0, 1] and n is the degree of the quadrature rule 
for which the error estimate is computed 20 . The polynomial is shifted by +1 
to avoid an integral of zero. 

(2) p n+1 (x): Same as the function above, yet one degree above the degree of the 
quadrature rule. Although this integrand is, by design, beyond the degree of 
the quadrature rule, the error term (i.e. the n+ 1 st derivative) is constant and 
can be extrapolated reliably 21 . 

(3) p n +2{x). Same as the function above, yet two degrees above the degree of the 
quadrature rule. By design, the n + 1st derivative is linear in x and changes 
sign inside the interval, meaning that any attempt to extrapolate that derivative 
from two estimates of equal degree may fail. 

(4) dk{x): A function with a discontinuity at x = a in the A:th derivative, where a 
is chosen randomly in the interval of integration [—1, 1] for k = 0, 1 and 2: 



Since all quadrature rules considered herein are interpolatory in nature and 
these integrands can not be reliably interpolated, these functions will only be 



20 For error estimates computed from the difference of two quadrature rules of different degree, 
the degree of the quadrature rule of lower degree is used since although the result rule of higher 
degree is effectively used for the returned integrand, the error estimate is usually understood to 
be that of the lower-degree rule. 

21 e.g. as is done implicitly in SQUANK (see Section 2.1, (9)) or explicitly in Ninomiya's error 
estimator (see Section 2.2, (13)) 




(77) 



di(x) 
d 2 {x) 



max {0, x — a} 
(max {0, x — a}) 2 



(78) 
(79) 
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correctly integrated by chance 22 . 
(5) s(x): A function with an integrable singularity at x = a, where a is chosen 
randomly in (—1,1): 



As with the previous set of functions, this function can not be reliably inter- 
polated and an intcrpolatory quadrature rule will produce a correct result only 
by chance 23 . 

These functions were tested for 10 000 realizations of the random parameters a 
and (3 for each of the relative tolerances r = 10 _1 , 10~ 3 , 10~ 6 , 10 -9 and 10~ 12 . 
Since most error estimators use absolute tolerances, the tolerance was set to the 
respective fraction of the integral. The following representative 24 error estimators 
were implemented in Matlab (2007a, The MathWorks, Natick, MA.) 25 and tested: 

(1) Kuncir's error estimate (Section 2.1, (3)), where n = 3 is the degree of the 
composite Simpson's rules used, 

(2) Oliver's error estimate (Section 2.3, (18)), starting with a Clenshaw-Curtis rule 
of degree 3, where n = 9 is the degree of the second-last rule used and the 
first error estimate below tolerance is returned or 2r if the interval is to be 
subdivided, 

(3) QUADPACK's QAG error estimator (Section 2.4, (26)) using the 10-point Gauss 
quadrature rule with n = 19 and its 21-point Kronrod extension, 

(4) Berntsen and Espclid's null-rule error estimate (Section 2.3, (22) and (23)) 
using, as a basic quadrature rule, the 21-point Clenshaw-Curtis quadrature 
rule 26 with n = 21 and values K = 3, r cr j t icai = 1/4 and a = 1/2. 

(5) Gander and Gautschi's error estimate as implemented in Matlab's quadl (Sec- 
tion 2.4, (28)) using the 4-point Gauss-Lobatto quadrature rule with n = 5 and 
its 7-point Kronrod extension, 

(6) Laurie's sharper error estimate (Section 3.3, (40)) using the 10-point Gauss 
quadrature rule with n = 19 and its 21-point Kronrod extension for the two 
rules and Q a respectively, as suggested by Laurie [1985] himself, 

(7) The trivial error estimate (Section 4, (69)) using the nodes of the n = m = 
11 and 7i2 = 21-point Clenshaw-Curtis quadrature rules to compute the two 
interpolations gn} (x) and g£} (x) respectively. 

(8) The more refined error estimate (Section 4, (76)) using the nodes of an 11- 
point Clenshaw-Curtis quadrature rule with n — 10 and one level of recursion 
to obtain c old , as well as 1.1 for the constant $i in (75). 

22 The only exception is CADRE (see Section 3.1), which attempts to detect jump discontinuities 
explicitly 

23 The only exception is again CADRE, which treats such singularities explicitly when detected (see 
Section 3.1, in cases where a is near the edges of the domain 

24 For compactness, the results for similar error estimators were omitted. The results for most 
other error estimators can be found in [Gonnet 2009a] . 

25 The Matlab source code of each routine tested is available from this author online at 
http: //people . inf . ethz . ch/gonnetp/csur/. 

26 the 21-point Gauss quadrature rule was also tried but left out since it produced worse results, 
i.e. more false positives. 



s(x) = \x — a 



1-1/2 
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Function 


r= 10- 1 


T = ID" 3 


T = 10~ 6 


T= lO- 9 


T = 10- 12 


p„M 

Pn+lM 

d Q (x) 
dl(a;) 
ds(x) 
»(*) 


100 (0/0) 
65.67 (0/34.33) 
51.07 (0/48.93) 
16.58 (0/22.27) 
44.92 (0/29.98) 
54.30 (0/22.66) 
(33-05/17.11) 


100 (0/0) 
8.49 (0/21.14) 
8.44 (0/15.77) 

(0/0.35) 
0.73 (0/1.59) 
5.74 (0/7.16) 
(0.42/0.20) 


100 (0/0) 
0.38 (0/0.76) 
0.50 (0/1.07) 
0(0/0) 
(0/0) 
0.22 (0/0.12) 
(0/0) 


100 (0/0) 
0.01 (0/0.02) 
0.03 (0/0.11) 
(0/0) 
(0/0) 
0.01 (0/0) 
(0/0) 


100 (0/0) 
(0/0.01) 
0.02 (0/0) 
(0/0) 
(0/0) 
(0/0) 
(0/0) 



Table I. Results for Kuncir's 1962 error estimate. 




Fig. 3. The integrand assumed by the 9-point Clenshaw-Curtis rule (left, dotted) used in Oliver's 
1972 error estimate and the 21-point Clenshaw-Curtis rule (right, dotted) used in Berntsen and 
Espelid's 1991 error estimate for the singular integrand s(x) (solid). 



Function 


T = 10" 1 


T = 10~ 3 


t = 10- 8 


T = 10-9 


-r = 10- 12 


Pti (32) 


65.69 (2.40/31.91) 


22.20 (0.25/77.55) 


8.67 (0/91.33) 


2.77 (0/97.23) 


0.72 (0/99.28) 


Pn + l( x ) 


55.07 (3.87/41.06) 


18.34 (0.22/69.25) 


6.03 (0/21.70) 


1.18 (0/5.57) 


0.23 (0/1.13) 


Pn + 2( x ) 


49.62 (5.79/44.59) 


14.93 (0.30/64.31) 


5.72 (0/18.04) 


1.52 (0/5.15) 


0.50 (0/1.58) 


d (x) 


20.44 (0/35.08) 


(0/0.64) 


(0/0) 


0(0/0) 


0(0/0) 


dl(a0 


71.27 (0.86/18.23) 


3.60 (6.96/10.86) 


(0/0.03) 


0(0/0) 


0(0/0) 


d 2 (x) 


78.09 (0/16.14) 


23.55 (5.33/18.77) 


0.35 (0/0.90) 


0.01 (0/0.03) 


0(0/0) 


•(») 


2.06 (66.71/15.27) 


(0.60/0.23) 


(0/0) 


(0/0) 


0(0/0) 



Table II. Results for Oliver's 1972 error estimate. 

5.2 Results 

The results of the tests described in Section 5.1 are shown in Tables I to VIII. For 
each integrand and tolerance, the percentage of correct integrations is given (i.e. the 
error estimate and the actual error are both below the required tolerance), as well 
as, in brackets, the percentage of false positives and false negatives respectively. 

Despite the low degree of the quadrature rule and its simplicity, Kuncir's error 
estimate (Section 2.1) performs rather well: almost all functions return no false pos- 
itives and relatively few false negatives. Only the singularity returns false positives 
for t = 1CP 1 in more than a third of the cases. 

Oliver's 1972 error estimate (Section 2.3) mis-predicts the errors for all three 
polynomials p n (x), Pn+i(%) and p„+2(x), due to the large higher-degree coefficients 
of the integrands. The false positives are cases where the doubly-adaptive algorithm 
exited after incorrectly predicting the error with a lower-order rule. This is also true 
for the discontinuities c?o Oz) , d\ (x) and di (x) , which are detected well by the higher- 
order rules since the higher-degree Chebyshev coefficients become relatively large, 
yet fail when the error is mis-predicted by the lower-degree rules. The algorithm 
fails when integrating the singularity s(x), since the coefficients of the interpolation 
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Function 


^=10-! 


T = 10~ 3 


T = 10- 6 


T = 10-° 


r = 10-1 2 


dot*) 

d 3 (x) 
»(•) 


100 (0/0) 
84.04 (0/15.96) 
76.68 (0/23.32) 
6.04 (0.32/79.64) 
22.50 (0.21/76.36) 
57.99 (0.18/41.19) 
0.26 (0.54/62.29) 


100 (0/0) 
70.01 (0/29.99) 
60.87 (0/39.13) 
0.11 (0.29/2.06) 
1.43 (0.35/44.96) 
15.36 (0.28/67.99) 
(0.03/0.35) 


100 (0/0) 
47.75 (0/52.25) 
38.91 (0/61.09) 

(0.49/0) 
0.12 (0.45/0.22) 
0.79 (0.30/5.23) 
0(0/0) 


100 (0/0) 
30.61 (0/69.39) 
25.60 (0/74.40) 
(0.45/0) 
0.01 (0.52/0) 
0.09 (0.34/0) 
0(0/0) 


100 (0/0) 
18.19 (0/81.81) 
16.22 (0/83.78) 
(0.38/0) 
(0.44/0) 
0.03 (0.48/0) 
(0/0) 




Tabic III. Results for Piessens et al. 's 1983 error estimate. 




Function 


T = 10-1 


T = 10- 3 


T = ll)- 6 


T = 10- 9 


T = 10- 12 


Pn+l( 3 ) 
P7i+2( a; ) 

dot*) 

di(*) 
d 2 (*) 
«(») 


51.98 (0/18-02) 
48.42 (0/51.58) 
43.89 (0/56.11) 
53.45 (0/31.20) 
85.10 (0/13.32) 
90.18 (0/8.94) 
13.03 (28.88/45.80) 


23.69 (0/76.31) 
21.97 (0/78.03) 
20.23 (0/79.77) 

(0/1.86) 
3.76 (0/41.23) 
34.92 (0/47.13) 

(0/0.34) 


8-15 (0/91.85) 
7.24 (0/78.24) 
6.77 (0/71.77) 
(0/0) 
(0/0.26) 
0.27 (0/5.23) 
(0/0) 


2.56 (0/97.44) 
2.13 (0/54.11) 
2.34 (0/45.22) 

(0/0) 

(0/0) 
(0/0.11) 

(0/0) 


0.97 (0/99.03) 
0.84 (0/29.48) 
0.73 (0/26.05) 

0(0/0) 

0(0/0) 

(0/0) 

(0/0) 



Table IV. Results for Berntsen and Espelid's 1991 error estimate. 



Function 


T= 10-1 


T = 10 — 3 


t = lO -6 


t = 10-° 


T = 10-12 


fnW 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


Pn+l(") 


80.08 (0/19.92) 


17.69 (0/82.31) 


0.56 (0/99.44) 


(0/100) 


(0/99.99) 


fn+2(') 


68.15 (0/31.85) 


17.88 (0/82.12) 


2.46 (0/97.54) 


0.33 (0/99.67) 


0.08 (0/99.92) 


d (x) 


10.33 (0/39.32) 


(0/0.59) 


0(0/0) 


(0/0) 


(0/0) 


d l ( x ) 


63.43 (2.32/23.63) 


0.70 (1.33/9.97) 


0(0/0) 


(0/0) 


(0/0) 


d 2 ( = ) 


68.98 (0/19.77) 


8.69 (0.03/25.79) 


0.31 (0/0.13) 


0.02 (0/0.01) 


0(0/0) 


«(») 


(44.15/22.67) 


(0.50/0.22) 


0(0/0) 


(0/0) 


(0/0) 



Table V. Results for Gander and Gautschi's 2001 error estimate. 

often decay smoothly, misleading it to believe the integrand itself is smooth (see 
Fig. 3, left). 

QUADPACK's error estimate (Section 2.4) does a very good job over all functions 
(Table III) . The error estimate generates a high number of false negatives for the 
polynomials p n+ i(x) and p„+2(x) since the quadrature rule used to approximate 
the integral is several degrees more exact than that for which the returned error 
estimate is computed. The few false positives are due to the error estimate's scaling 
of the error, causing it to under-predict the actual error and to cases where the 
discontinuity at a was outside of the open nodes of the quadrature rule. The false 
positives for the discontinuities do (2)1 d\(x) and d,2{x) and the singularity s(x) at 
r = 10 -1 are due to accidentally small differences between the Gauss and Gauss- 
Kronrod approximations. 

Berntsen and Espelid's null-rule error estimate (Section 2.3) suffers from the 
same problems as Oliver's error estimate for the polynomial p n (x): Although the 
integration is exact, the coefficients Cj increase towards i — n, leading the algorithm 
to believe that the n + 1 st coefficient will be large when it is, in fact, zero. The 
algorithm mis-predicts the error for the singularity s(x) for the same reason as 
Oliver's algorithm, namely that the coefficients of the polynomial interpolation 
decrease smoothly, falsely indicating convergence (see Fig. 3, right). 

Gander and Gautschi's error estimate (Section 2.4) generates a high number 
of false negatives for p n+ i(x) and p„+2(x), due to the higher degree of the esti- 
mate effectively returned. The error estimation returns some false positives for the 
discontinuities dg(x), d\(x) and d2(x), as well as for the singularity s(x), due to 
the difference between the two quadrature rules used being "accidentally small" 
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-0.05 I 1 i 1 

-1 -0.5 0.5 1 

Fig. 4. The integrands assumed by the Gauss-Lobatto (dashed) and Gauss-Kronrod (dotted) 
quadrature rules in Gander and Gautschi's 2001 error estimate over the discontinuous integrand 
d\(x) (solid). 



Function 


t = in- 1 


x = 10~ 3 


T = 10- 6 


x = 10- 9 


x = 10- 12 




100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


Pn+l(") 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


Pn + 2l» 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 


100 (0/0) 




30.26 (0.09/62.46) 


0.12 (0.09/3.93) 


(0.18/0.01) 


(0.20/0) 


(0.24/0) 




36.78 (0.07/62.75) 


24.67 (3.78/48.51) 


0.25 (1.14/0.55) 


(0.41/0.01) 


(0.46/0) 




44.81 (0.11/54.70) 


40.21 (0.94/51.18) 


3.52 (4.74/15.25) 


0.14 (0.13/0.16) 


0.03 (0.32/0) 


«(«) 


25.01 (0.06/64.82) 


(4.52/0.52) 


(0.03/0) 


(0/0) 


0(0/0) 



Tabic VI. Results for Laurie's 1983 error estimate. 



Function 


X = 10-1 


x = 10~ 3 


x = 10- 8 


x = 10-9 


x = 10- 12 


Pn (x) 

Pn + l(^) 

Pu+2f I ) 

d (x) 

d 1 {x) 

d 2 (x) 

»(*) 


100 (0/0) 
89.78 (0/10.22) 
81.73 (0/18.27) 

(0/84.09) 
66.03 (0/32.46) 
76.67 (0/22.50) 

(0/59.16) 


100 (0/0) 
52.10 (0/47.90) 
40.76 (0/59.24) 

(0/2.31) 
0.34 (0/44.30) 
16.19 (0/65.95) 

(0/0.39) 


100 (0/0) 
14.80 (0/85.20) 
12.22 (0/87.78) 
(0/0) 
(0/0.28) 
0.16 (0/5.34) 
(0/0) 


100 (0/0) 
4.06 (0/95.94) 
4.52 (0/95.48) 
0(0/0) 
0(0/0) 
0.01 (0/0.12) 
0(0/0) 


100 (0/0) 
1.12 (0/98.88) 
1.34 (0/98.66) 

(0/0) 

(0/0) 

(0/0) 

(0/0) 



Table VII. Results for Gonnct's 2009 trivial error estimate. 



(e.g. Fig. 4). 

Laurie's error estimate (Section 3.3) is exact even for the polynomials p n +i(x) 
and p n +2{x)'. despite being of higher degree than the second- highest degree rule, 
the error of the highest-degree rule is correctly extrapolated. The discontinuities 
do(x), di(x) and d,2(x) and the singularity s(x) are not always detected since the 
condition in (42) holds in some cases where the necessary condition in (41) does 
not, resulting in some false positives over all tolerances. 

In both new error estimates described in Section 4, the errors of the polynomials 
p n +i(x) and p n +2(x) tend to be over-estimated as the computed L2-norm is a 
somewhat pessimistic estimate of the integration error. What is notable is that 
these error estimates never under-estimated the error, resulting in no false positives 
at all. 
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Function -r ^ IP -1 t = 10~ 3 t = 10~ 6 r = 10~ 9 r = 10~ 12 

p„(x) 100 (0/0) 100 (0/0) 100 (0/0) 100 (0/0) 100 (0/0) 
P?l + 1 (a0 100 (0/0) 100 (0/0) 58.76 (0/41.24) 17.49 (0/82.51) 5.15 (0/94.85) 
Pn + 2< x ) 83.30 (0/16.70) 58.78 (0/41.22) 28.18 (0/71.08) 9.05 (0/46.17) 3.03 (0/14.26) 
drjM 0(0/81.48) 0(0/2) 0(0/0) 0(0/0) 0(0/0) 
dl(x) 68.87 (0/27.89) 0.40 (0/54.34) (0/0.10) (0/0) (0/0) 
d2<^) 82.21 (0/15.81) 17.88 (0/58.11) 0.22 (0/5.08) (0/0.07) (0/0) 
s(x) (0/59.19) (0/0.33) (0/0) (0/0) (0/0) 



Table VIII. Results for Gonnet's 2009 refined error estimate. 

5.3 Summary 

According to the results using the chosen test integrands, the best two error estima- 
tors appear to be that of Piessens et al. (Section 2.4) which is the error estimator 
for the adaptive routines in the popular integration library QUADPACK, and the 
two new error estimators presented herein (Section 4). 

The relatively few false positives returned by the QUADPACK error estimator 
may seem negligible in contrast with its efficiency (evidenced by the much smaller 
percentage of false negatives) compared to the new error estimate. We can verify 
this by evaluating the smooth integral 

( ; T77 dx 

J 1 0.01 + (x- A) 2 

first suggested by Lyncss and Kaganove [1976], for which we compute 1 000 realiza- 
tions of the parameter AG [1,2]. We use both Piessens et al. 's error estimate and 
the two new error estimates as implemented for the previous tests in a recursive 
scheme as in Algorithm 1 with r' = t/\/2, to a relative precision of r = 10~ 9 . 
On average, Piessens et al. 's error estimate requires 157 function evaluations while 
the new error estimates require 379 and 330 evaluations respectively - roughly 
more than twice as many. Both methods integrate all realizations to the required 
tolerance. 

If we consider, however, the Waldvogel 27 function 

W{x) = f [e*J dt 
Jo 

which wc wish to evaluate to the relative precision r = 10 -9 for 1 000 realizations 
of x G [2.5, 3.5] using both the error estimates of Piessens et al. and our new error 
estimators as described above, we get very different results. While Piessens et al. 's 
error estimator fails in roughly three quarters of all cases (753 failures out of 1 000, 
see Fig. 5), usually missing a sub- interval containing one or more discontinuities and 
using, on average, 29 930 function evaluations, our new error estimators succeeds on 
every trial, using on average 31 439 and 29 529 function evaluations respectively. For 
this integrand, a single bad error estimate is sufficient for the entire computation 
to fail and, in this case, the cautious estimate pays off. 

6. CONCLUSIONS 

In this review we have analyzed a large part of error estimates for adaptive quadra- 
ture published in the last 45 years or so. We have shown that all these estimates 



This function was suggested to the author by Prof. Jorg Waldvogel. 
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Fig. 5. Piessens et al. 's error estimate used to evaluate one realization of the Waldvogcl-function. 
The circles mark the edges of the sub-intervals. Note that the integrand is not well resolved near 
x ps 1.4 and x ss 2.6. 



can be reduced to either a linear or non-linear approximation of the integral and 
one or more error terms of the underlying quadrature rule: 

Qi m) [a,6]=/ f(x)dx + K 1 h ai +K 2 h a ' 2 +--- + n N h aN , h=^—. (80) 



For the linear error estimators discussed in Section 2, the exponents on, i = 1 . . . N 
are assumed to be known. For the non-linear error estimators discussed in Section 3, 
the aj, i — 1 . . . N are not assumed to be known and are also approximated. In 
both cases, N is usually 1 with the exception of de Boor's CADRE (see Section 3.1) 
and de Doncker's adaptive extrapolatory algorithm (see Section 3.4). 

These error estimators all fail for the same reason, namely when the difference 
between two successive quadratures is 11 accidentally smaW . This can happen when 
the actual error contains more significant terms than the ones shown in (80). 

The new error estimators presented in Section 4 are no different as they approx- 
imate the error for JV = 1 and a supposed ct\ = n + 1. The main difference is 
that instead of using different approximations of the integral of different quadra- 
ture rules, we use the L2-norm of the difference of the interpolating polynomials of 
different quadrature rules to approximate the unknown terms in (80). As we will 
see, this significantly reduces the probability of accidentally small differences, and 
thus avoid the major cause of failure of the other algorithms, as is demonstrated 
by the results in Section 5. 

The reason in this increased reliability is best explained by considering, for any 
error estimator, the set of integrands for which it will always fail. Consider the 
polynomials orthogonal with respect to the discrete product 

n 

(Pi(x),pj(x)) = 22pi(x k )pj{x k ), i,j = 0...n (81) 
fc=i 

where the Xk are the nodes of the quadrature rule or the combined nodes of all the 
quadrature rules used in the computation of the error estimate in the interval. In 
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the following, when we refer to a pair of functions being orthogonal, we understand 
them to be orthogonal with respect to the above product. 

For any linear error estimate relying on the difference between two quadrature 
rules over the nodes Xi, the error estimate can be computed as 

n 
i=l 

where the rji arc the difference of the weights of the two quadrature rules used in 
the error estimate for each node 28 . Let rj(x) be the polynomial interpolating the 
rji at the nodes a;,, i = 1 . . . n. The error can then be computed as the product in 
(81) applied to the integrand fix) and the polynomial r](x): 

£ = (v{x),f(x)}. 

Therefore, if the integrand f(x) is of algebraic degree higher than that of the 
quadrature rule used — and will therefore not be correctly integrated — and the 
integrand f(x) is orthogonal to the polynomial rj(x), then the linear error estimate 
will be zero and therefore it will fail. 

For the error estimate of O'Hara and Smith (Section 2.3) and of Oliver (Sec- 
tion 2.3), which use more than one derivative, the error estimate fails when the 
integrand fix) is of higher algebraic degree than the basic quadrature rule and 
the coefficients c n , c„_2 and c„_4 are zero (see (16)). This is the case when the 
integrand f(x) is orthogonal to the Chebyshev polynomials T n (x), T n _ 2 (x) and 
T n -i(x). 

For the error estimate of Berntsen and Espelid (Section 2.3), the error estimate 
fails when the integrand f{x) is of higher algebraic degree than the basic quadrature 
rule and the integrand f{x) is orthogonal to the last 2(K — 1) null-rules 29 . 

For the non-linear error estimates discussed in Section 3, the error estimates will 
fail under similar circumstances: In de Boor's CADRE (see Section 3.1), it is sufficient 
that the difference between two neighboring entries in the T-table is zero for the 
error estimate to fail. For a T-table of depth £, this engenders 0(£ 2 /2) different 
polynomials to which the integrand may be orthogonal to for the error estimate to 
fail. 

In the case of Rowland and Varol's or Venter and Laurie's error estimates (see 
Section 3.2), a difference of zero between two consecutive pairs of rules is sufficient 
for the error estimate to fail and thus, as for the simple error estimators discussed 
above, for a sequence of m rules, there arc m— 1 polynomials to which an integrand 
f(x) may be orthogonal to for which the error estimator will always fail. 

(2) (2) (2) (1) 

In Laurie's error estimate (see Section 3.3), either Q a — or 

Qa Qa 

need to be zero for the estimate to fail, resulting in two polynomials to which the 
integrand may be orthogonal to for the error estimate to fail. Similarly, for Favati 
et al. 's error estimate (see Section 3.3), there are three such polynomials. 



28 The r/i are, incidentally, the weights of a null rule, such as they are constructed by Lyness [1965]. 
29 In Berntsen and Espclid's original error estimate 2 null-rules are used to compute each from 
which the K ratios (see (21)) are computed. It is, however, only necessary that the nominators 
of the ratios be zero, hence only 2(K — 1) null-rules need to be zero for the estimate to be zero. 
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Finally, for dc Doncker's error estimate (see Section 3.4), the case is some- 
what more complicated due to the global approach of the algorithm. Since it 
uses, locally, Piessens et al. 's local error estimate (see Section 2.4), it will fail 
whenever this estimate fails, making it vulnerable to the same family of inte- 
grands. Additionally, it will fail whenever the difference between two global es- 
timates Qn [a, b] — Q^™ -1 -* [a, b] accidentally becomes zero, causing the algorithm 
to fail globally. 

For both new error estimates presented here ((69) and (76)), the matter is a bit 
more complicated. Given two interpolations <?^_i(x) and 9„ —ii x )i with n 2 > fix, 
over the nodes x\ , i = 1 . . . n% — 1 and x\ , i = 1 . . . 712 — 1 respectively, we define 
the joint set of n u nodes 

x (u) = X (1)\J X (2) w hich we will use for the product in (81). 
Given the inverse Vandermonde-like matrices U^ 1 -* = (PW) -1 and U' 2 ) = (P' 2 )) -1 
of size fii x ri\ and n-i x ni used to compute the coefficients of gn}{x) and gn](x), 
we can stretch them to size 712 x n u such that 

c* 1 ) = uWfM, c (2) = U( 2 )f(") 

where U' 1 ' and are the stretched matrices and f( u > contains the integrand 

evaluated at the joint set of nodes x^ u \ For the error estimate ||cW — c^ 2 ^|j to be 
zero, f must lie in the null-space of the n.2 x n u matrix 

Xj(«) = [u(i) _ u(2) 

which has rank r u equal to the smaller of the number of nodes not shared by 
both .tW and x^ 2 \ i.e. x^ u '\{x^' H x^} or n 2 . For the error estimate to be 
zero, the product Tj( u )f( u ) must be zero. This is the case when the integrand 
f{x) is of algebraic degree > 712 and orthogonal to the r u polynomials generated 
by interpolating the values of the first r u rows of TJ(") at the nodes x (u l If, 
additionally, the integrand is of degree > n?,, then both error estimates will fail. 

The space of functions that will cause any of the error estimators presented here 
to fail is, in essence, infinite, yet for each type of error estimator, this infinite space 
is subject to different restrictions. For the simple linear error estimators which 
compute a single divided difference, the space is restricted by a single orthogonality 
restriction. In the case of error estimators such as O'Hara and Smith's or Berntsen 
and Espelid's, the space is restricted by three or four 30 orthogonality restrictions. 
Instead of being subject to one or more restrictions, the space of functions that will 
cause the non-linear error estimators discussed in Section 3 to fail is larger than 
that of the simple error estimators, since the integrand needs only to be orthogonal 
to any of a set of polynomials for the algorithm to fail. The set of functions for 
which they will fail is therefore the union of a set of functions, each subject to only 
one restriction. For our new error estimators, the number of restrictions depends 
on the number of nodes used. For the trivial error estimate ((69)), if the nodes 
x^' C :r/ 2 ) and ri2 ~ 2ni (i.e. if Clcnshaw-Curtis or Gauss- Kronrod rule pairs 
are used), the number of restrictions will be ss ^2/2. For the more refined error 
estimate ((76)), if the basic rule does not re-use more than \n/2] of its n nodes in 
each sub- interval, the number of restrictions will be at least n — 1. 



30 In Berntsen and Espelid's original error estimate, a constant K = 3 is used. 
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The new error estimates presented in Section 4 are therefore more reliable since 
the space of functions for which it will fail, albeit infinite, is more restricted than 
that of the other error estimators presented here. It is also interesting to note that 
if we were to increase the degree of the underlying quadrature rules in all our error 
estimates, the number of restrictions to the space of functions for which they will 
fail would not grow, whereas for our new error estimates, the number of restrictions 
grows linearly with the degree of the underlying quadrature rule. 

Two adaptive quadrature algorithms implementing the new error estimates have 
been described and extensively tested in Gonnet [2010]. One of the algorithms pre- 
sented therein has been implemented as cquad in both the GNU Scientific Library 
[Galassi et al. 2009] and as a part of GNU Octave [Eaton 2002]. 
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